A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers
Authors
Abstract
The structure of a sentence can be seen as a spanning tree in a linguistically augmented graph of syntactic nodes. This paper presents an approach to unlabeled dependency parsing based on this view. The first step marks the chunks and chunk heads of a given sentence and then identifies the intra-chunk dependency relations. The second step learns to identify the inter-chunk dependency relations. For this, we use an initialization technique based on a measure we call Normalized Conditional Mutual Information (NCMI), together with a few linguistic constraints. We present results for Hindi: a precision of 80.83% for sentences of fewer than 10 words and 66.71% overall. This is significantly better than the baseline, in which random initialization is used.
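The spanning-tree view in the abstract can be illustrated with a small sketch. This is not the paper's method: the abstract does not define NCMI, so the scores below are made-up placeholder numbers standing in for an association measure between chunk heads, and the tree is built with plain Prim's algorithm on an undirected graph rather than a directed (arborescence) algorithm. The function name `max_spanning_tree` and the toy sentence are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def max_spanning_tree(nodes: List[str], score: Dict[Edge, float]) -> List[Edge]:
    """Greedy (Prim-style) maximum spanning tree over a complete undirected graph.

    nodes[0] is treated as the root; each returned pair is (head, dependent),
    i.e. the edge points away from the root. Missing pairs score -infinity.
    """
    def weight(a: str, b: str) -> float:
        # Scores are symmetric in this sketch, so look up either orientation.
        return score.get((a, b), score.get((b, a), float("-inf")))

    in_tree = [nodes[0]]
    remaining = list(nodes[1:])
    edges: List[Edge] = []
    while remaining:
        # Pick the best-scoring edge that attaches a new node to the tree.
        head, dep = max(
            ((a, b) for a in in_tree for b in remaining),
            key=lambda e: weight(*e),
        )
        edges.append((head, dep))
        in_tree.append(dep)
        remaining.remove(dep)
    return edges


# Toy example: chunk heads of a three-chunk sentence, with placeholder
# association scores (assumed values, not computed from any corpus).
heads = ["ate", "Ram", "mango"]
toy_scores = {
    ("ate", "Ram"): 0.9,
    ("ate", "mango"): 0.8,
    ("Ram", "mango"): 0.1,
}
tree = max_spanning_tree(heads, toy_scores)
print(tree)
```

In this toy case both noun heads attach to the verb, because those two edges carry the highest scores. A closer approximation of dependency parsing would use a directed maximum spanning tree algorithm and scores learned from data.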
Similar resources
Parse Imputation for Dependency Annotations
Syntactic annotation is a hard task, but it can be made easier by allowing annotators flexibility to leave aspects of a sentence underspecified. Unfortunately, partial annotations are not typically directly usable for training parsers. We describe a method for imputing missing dependencies from sentences that have been partially annotated using the Graph Fragment Language, such that a standard ...
Does it have to be trees?: data-driven dependency parsing with incomplete and noisy training data
We present a novel approach to training data-driven dependency parsers on incomplete annotations. Our parsers are simple modifications of two well-known dependency parsers, the transition-based Malt parser and the graph-based MST parser. While previous work on parsing with incomplete data has typically couched the task in frameworks of unsupervised or semi-supervised machine learning, we essent...
Effective Greedy Inference for Graph-based Non-Projective Dependency Parsing
Exact inference in high-order graph-based non-projective dependency parsing is intractable. Hence, sophisticated approximation techniques based on algorithms such as belief propagation and dual decomposition have been employed. In contrast, we propose a simple greedy search approximation for this problem which is very intuitive and easy to implement. We implement the algorithm within the second...
Data-Driven Dependency Parsing of New Languages Using Incomplete and Noisy Training Data
We present a simple but very effective approach to identifying high-quality data in noisy data sets for structured problems like parsing, by greedily exploiting partial structures. We analyze our approach in an annotation projection framework for dependency trees, and show how dependency parsers from two different paradigms (graph-based and transition-based) can be trained on the resulting tree...
Experiments in Newswire-to-Law Adaptation of Graph-Based Dependency Parsers
We evaluate two very different methods for domain adaptation of graph-based dependency parsers on the EVALITA 2011 Domain Adaptation data, namely instance-weighting [10] and self-training [9, 6]. Since the source and target domains (newswire and law, respectively) were very similar, instance-weighting was unlikely to be efficient, but some of the semi-supervised approaches led to significant im...